{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 实例13：用Python批量查询采购物料交期及实际收货日期"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们在实例10中介绍了工程部采购物料用于新产品开发，如何从杂乱的备注栏获取工程采购单号。获取到采购单号，只是第一步，随后还需要查询物料的交期和实际收货日期。\n",
    "\n",
    "对于负责物料跟进的员工，每个新产品开发订单对应物料的交期都至关重要。物料清单中哪怕一个小小的物料无法即时到货，都可能导致成品无法准时交付，引起客户的抱怨及扣款，甚至丢掉生意。\n",
    "\n",
    "公司的系统分为采购系统和物料接受系统。采购系统是采购部同事下单的时候使用的，里面会记录与供应商谈好的交期。当物料到厂后，货仓的同事会将物料收入库存之中，即在物料接受系统内登记物料名称及数量等信息。\n",
    "\n",
    "每天公司都有一些新产品订单产生，每个订单对应一长串物料清单，这些物料的交期及实际收货日期需要跟进，如果逐个查询，将会花费大量的时间和精力。下面我们就演示使用Python来进行批量查询并写入到对应订单物料清单Excel表中。\n",
    "\n",
    "我们的数据来源于两张Excel表，“采购信息表”和“收货信息表”。我们需要通过物料的编号和对应采购单号（EPR编号）去两张表中分别获取交期和实际收货日期。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "#使用正则表达式获取EPR编号\n",
    "from re import compile, IGNORECASE\n",
    "def get_EPR_num(text):\n",
    "    #将问号'?' 替换成 '-'\n",
    "    changed_question_mark = text.replace(\"?\",\"-\")\n",
    "    #获取EPR编号\n",
    "    epr_num = compile(r'EPR-\\d{2}-X\\d{3}(\\d)?',IGNORECASE)\n",
    "    match = epr_num.search(changed_question_mark)\n",
    "    #将小写EPR转换为大写并返回\n",
    "    return match.group().upper()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "以上定义的函数来自实例10，用于从采购信息表乱七八糟的备注栏中提取出正确的EPR编号。详情可移步实例10：https://zhuanlan.zhihu.com/p/62359845"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openpyxl import load_workbook\n",
    "#获取交期，并存入字典\n",
    "def get_promise_date():\n",
    "    wb = load_workbook(\"data/采购信息表.xlsx\")\n",
    "    ws = wb.active\n",
    "    promise_dates = {} #存储零件对应的EPR编号及交期\n",
    "    for row in range(2, ws.max_row+1): #从第2行开始遍历采购信息表\n",
    "        part_num = ws['A' + str(row)].value #A列为零件编号\n",
    "        promise_date = ws['B' + str(row)].value.date() #B列为交期\n",
    "        remark = str(ws[\"D\" + str(row)].value) #D列为备注栏，其中包含EPR信息\n",
    "        promise_dates.setdefault(part_num, {}) #设置数据结构为字典套字典，外层字典的键为零件编号\n",
    "        EPR = get_EPR_num(remark)# 调用上面写好的正则表达式函数，从备注栏提取EPR编号\n",
    "        promise_dates[part_num].setdefault(EPR, promise_date)#内层字典以EPR编号为键，交期为值\n",
    "    return promise_dates"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "以上程序用于从“采购信息表”中提取交期信息。采购信息表如下图所示，它包含零件号，承诺交期，数量和备注栏。其中EPR编号就在备注栏中。当采购部门按照工程师的EPR申请下采购单后，就会在系统记录下这么一条采购信息，若系统无相关信息，则表示采购还未下单。\n",
    "![](images\\purchase_info.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们在提取交期信息的时候，必须同时对应到零件号和EPR编号。因为采购信息表里面可能存在相同的零件号对应多条信息。但是通过常规采购的，那就没有EPR编号（有EPR编号的都是工程部的特殊采购）。或者工程部对一个零件号特殊采购了两次，那就会有两个EPR编号。所以我们提取的信息的数据结构需要是这样子：\n",
    "![](images\\data_structure.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "即一个大字典`promise_dates`内部含有嵌套字典。大字典的键全部为零件号，其值也是一个字典。嵌套的小字典用“EPR”编号作为键，值为“交期”，零件号对应的一条或多条EPR信息都放入这个字典。其中`setdefault()` 函数在此非常关键，`promise_dates.setdefault(part_num, {})`的意思是，如果键`part_num`不存在于字典`promise_dates`中时，将会添加这个`part_num`为键，并将值设为空字典`{}`；如果键`part_num`已经存在，将不会新建这个键，因为字典的键必须是唯一值，再新建会覆盖之前的数据。随后程序继续运行，我们通过写好的函数提取正确的EPR编号`EPR = get_EPR_num(remark)`。然后将对应的EPR和其交期放入嵌套的空字典`{}`中，即`promise_dates[part_num].setdefault(EPR, promise_date)`。此处嵌套字典也使用`setdefault()` 函数是为了兼容多个交期对应一个EPR编号的情况。\n",
    "\n",
    "比如采购信息表中的第一个零件号是'60022-5054001'，因为这是第一个数据，'60022-5054001'肯定不在字典`promise_dates`中，所以此时就在字典`promise_dates`中新建了这个键'60022-5054001'，其值为空字典`{}`。然后获取其第一个EPR编号，即`'EPR-18-X1208'`，然后将其对应的交期一起作为键值对放入空字典中，这条信息就变为`'60022-5054001': {'EPR-18-X1208': datetime.date(2018, 6, 15)}`。随后`for`循环再检查下一行，这一行的零件号还是'60022-5054001'，因此不用新建一个键，继续在之前的键'60022-5054001'里面追加数据即可，于是再将该零件号对应的第二个EPR对应的交期放入，信息变为`'60022-5054001': {'EPR-18-X1208': datetime.date(2018, 6, 15),\n",
    "  'EPR-18-X1209': datetime.date(2018, 6, 16)}`。然后继续遍历下一行，直到结束。我们可以调用一下这个函数，获取到的信息如下。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'60022-5054001': {'EPR-18-X1208': datetime.date(2018, 6, 15),\n",
       "  'EPR-18-X1209': datetime.date(2018, 6, 16)},\n",
       " '8810020-01844': {'EPR-18-X713': datetime.date(2018, 6, 13)},\n",
       " '8810030-01125': {'EPR-18-X713': datetime.date(2018, 8, 15)},\n",
       " '8810140-00042': {'EPR-18-X713': datetime.date(2018, 6, 15)},\n",
       " '8812121-02142': {'EPR-18-X941': datetime.date(2018, 8, 15)},\n",
       " '8836012-04093': {'EPR-18-X942': datetime.date(2018, 8, 25)},\n",
       " '8836032-02101': {'EPR-18-X942': datetime.date(2018, 6, 15)},\n",
       " '8841001-00078': {'EPR-18-X684': datetime.date(2018, 8, 25)},\n",
       " '8841002-00079': {'EPR-18-X684': datetime.date(2018, 6, 13)},\n",
       " '8841019-00002': {'EPR-18-X449': datetime.date(2018, 6, 15)},\n",
       " '8842001-00012': {'EPR-18-X688': datetime.date(2018, 8, 25)},\n",
       " '8842002-00020': {'EPR-18-X708': datetime.date(2018, 6, 13)},\n",
       " '8842008-00009': {'EPR-18-X897': datetime.date(2018, 4, 27)},\n",
       " '8842009-00006': {'EPR-18-X707': datetime.date(2018, 6, 15)},\n",
       " '8843002-00012': {'EPR-18-X687': datetime.date(2018, 4, 27)},\n",
       " '8847005-00020': {'EPR-18-X686': datetime.date(2018, 6, 15)},\n",
       " '8847006-00446': {'EPR-18-X1137': datetime.date(2018, 6, 13)},\n",
       " '8848001-00000': {'EPR-18-X0004': datetime.date(2018, 5, 2)},\n",
       " '8848001-09120': {'EPR-18-X1168': datetime.date(2018, 6, 13)},\n",
       " '8860273-24370': {'EPR-18-X1176': datetime.date(2018, 8, 15)},\n",
       " '8860280-90000': {'EPR-18-X700': datetime.date(2018, 6, 15)},\n",
       " '8860301-23560': {'EPR-18-X1086': datetime.date(2018, 5, 2)},\n",
       " '8861011-10330': {'EPR-18-X499': datetime.date(2018, 6, 15)},\n",
       " '8861011-12202': {'EPR-18-X499': datetime.date(2018, 8, 25)},\n",
       " '8862010-05001': {'EPR-18-X935': datetime.date(2018, 8, 15)},\n",
       " '8862010-05002': {'EPR-18-X1584': datetime.date(2018, 8, 25)},\n",
       " '8862028-01010': {'EPR-18-X499': datetime.date(2018, 8, 15)},\n",
       " '8863089-00000': {'EPR-18-X610': datetime.date(2018, 4, 27)},\n",
       " '8863093-00000': {'EPR-18-X499': datetime.date(2018, 5, 2)},\n",
       " '8865059-00000': {'EPR-18-X499': datetime.date(2018, 6, 13)},\n",
       " '8865091-00001': {'EPR-18-X499': datetime.date(2018, 8, 25)},\n",
       " '8866062-00000': {'EPR-18-X499': datetime.date(2018, 8, 25)},\n",
       " '8867021-00024': {'EPR-18-X935': datetime.date(2018, 4, 27)},\n",
       " '8867021-00033': {'EPR-18-X625': datetime.date(2018, 4, 27),\n",
       "  'EPR-18-X935': datetime.date(2018, 6, 13)},\n",
       " '8867591-00470': {'EPR-18-X610': datetime.date(2018, 6, 15)},\n",
       " '8868706-00000': {'EPR-18-X610': datetime.date(2018, 6, 13)},\n",
       " '8868707-00000': {'EPR-18-X572': datetime.date(2018, 8, 25),\n",
       "  'EPR-18-X676': datetime.date(2018, 5, 2)},\n",
       " '8873006-00007': {'EPR-18-X610': datetime.date(2018, 4, 27)},\n",
       " '8873015-00003': {'EPR-18-X610': datetime.date(2018, 8, 15)},\n",
       " '8873099-00005': {'EPR-18-X141': datetime.date(2018, 5, 2)},\n",
       " '8874001-00001': {'EPR-18-X771': datetime.date(2018, 6, 13)},\n",
       " '8875008-00000': {'EPR-18-X759': datetime.date(2018, 8, 25)},\n",
       " '8876039-00000': {'EPR-18-X610': datetime.date(2018, 5, 2)}}"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_promise_date()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [],
   "source": [
    "#获取收货日期，并存入字典\n",
    "def get_receive_date():\n",
    "    wb = load_workbook(\"data\\收货信息表.xlsx\")\n",
    "    ws = wb.active\n",
    "    receive_dates = {}\n",
    "    for row in range(2, ws.max_row+1):\n",
    "        part_num= ws['A' + str(row)].value\n",
    "        receive_date = ws['B' + str(row)].value.date()\n",
    "        epr_num = str(ws[\"D\" + str(row)].value)\n",
    "        receive_dates.setdefault(part_num, {})\n",
    "        receive_dates[part_num].setdefault(epr_num, receive_date)\n",
    "    return receive_dates"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "然后我们从“收货信息表”中获取收货日期，并存入字典`receive_dates `，其数据结构和逻辑与获取采购信息一样。调用函数`get_receive_date()`可以看到收货信息的结果如下。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'8836012-04093': {'EPR-18-X942': datetime.date(2018, 8, 28)},\n",
       " '8836032-02101': {'EPR-18-X942': datetime.date(2018, 6, 18)},\n",
       " '8847006-00446': {'EPR-18-X1137': datetime.date(2018, 6, 16)},\n",
       " '8812121-02142': {'EPR-18-X941': datetime.date(2018, 8, 18)},\n",
       " '8860301-23560': {'EPR-18-X1086': datetime.date(2018, 5, 5)},\n",
       " '8868707-00000': {'EPR-18-X572': datetime.date(2018, 8, 28),\n",
       "  'EPR-18-X676': datetime.date(2018, 8, 30)},\n",
       " '60022-5054001': {'EPR-18-X1208': datetime.date(2018, 6, 18)},\n",
       " '8874001-00001': {'EPR-18-X771': datetime.date(2018, 6, 16)},\n",
       " '8862028-01010': {'EPR-18-X499': datetime.date(2018, 8, 18)},\n",
       " '8863093-00000': {'EPR-18-X499': datetime.date(2018, 5, 5)},\n",
       " '8863089-00000': {'EPR-18-X610': datetime.date(2018, 4, 30)},\n",
       " '8866062-00000': {'EPR-18-X499': datetime.date(2018, 8, 28)},\n",
       " '8867591-00470': {'EPR-18-X610': datetime.date(2018, 6, 18)},\n",
       " '8865059-00000': {'EPR-18-X499': datetime.date(2018, 6, 11)},\n",
       " '8865091-00001': {'EPR-18-X499': datetime.date(2018, 8, 23)},\n",
       " '8861011-10330': {'EPR-18-X499': datetime.date(2018, 6, 13)},\n",
       " '8868706-00000': {'EPR-18-X610': datetime.date(2018, 6, 11)},\n",
       " '8873015-00003': {'EPR-18-X610': datetime.date(2018, 8, 13)},\n",
       " '8876039-00000': {'EPR-18-X610': datetime.date(2018, 4, 30)},\n",
       " '8873006-00007': {'EPR-18-X610': datetime.date(2018, 4, 25)},\n",
       " '8861011-12202': {'EPR-18-X499': datetime.date(2018, 8, 23)},\n",
       " '8810140-00042': {'EPR-18-X713': datetime.date(2018, 6, 13)},\n",
       " '8810020-01844': {'EPR-18-X713': datetime.date(2018, 6, 11)},\n",
       " '8810030-01125': {'EPR-18-X713': datetime.date(2018, 8, 13)},\n",
       " '8843002-00012': {'EPR-18-X687': datetime.date(2018, 4, 25)},\n",
       " '8842001-00012': {'EPR-18-X688': datetime.date(2018, 8, 23)},\n",
       " '8841002-00079': {'EPR-18-X684': datetime.date(2018, 6, 11)},\n",
       " '8841001-00078': {'EPR-18-X684': datetime.date(2018, 8, 23)},\n",
       " '8842009-00006': {'EPR-18-X707': datetime.date(2018, 6, 13)},\n",
       " '8842002-00020': {'EPR-18-X708': datetime.date(2018, 6, 11)},\n",
       " '8860273-24370': {'EPR-18-X1176': datetime.date(2018, 8, 13)},\n",
       " '8873099-00005': {'EPR-18-X141': datetime.date(2018, 5, 2)},\n",
       " '8875008-00000': {'EPR-18-X759': datetime.date(2018, 8, 25)},\n",
       " '8860280-90000': {'EPR-18-X700': datetime.date(2018, 6, 15)},\n",
       " '8867021-00033': {'EPR-18-X935': datetime.date(2018, 4, 27)},\n",
       " '8862010-05001': {'EPR-18-X935': datetime.date(2018, 8, 20)},\n",
       " '8867021-00024': {'EPR-18-X935': datetime.date(2018, 5, 2)},\n",
       " '8841019-00002': {'EPR-18-X449': datetime.date(2018, 6, 20)},\n",
       " '8848001-09120': {'EPR-18-X1168': datetime.date(2018, 6, 18)},\n",
       " '8848001-00000': {'EPR-18-X0004': datetime.date(2018, 8, 30)},\n",
       " '8842008-00009': {'EPR-18-X897': datetime.date(2018, 5, 2)}}"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_receive_date()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "处理订单：订单001.xlsx\n",
      "处理订单：订单002.xlsx\n",
      "处理订单：订单003.xlsx\n",
      "处理订单：订单004.xlsx\n",
      "处理订单：订单005.xlsx\n",
      "完成!\n"
     ]
    }
   ],
   "source": [
    "#批量查询并写入采购信息和收货信息\n",
    "from openpyxl import load_workbook\n",
    "from os import walk, path\n",
    "#调用以上两个函数，获取“交期”和“收货日期”\n",
    "promise_date=get_promise_date() #交期\n",
    "receive_date=get_receive_date() #收货日期\n",
    "#遍历“订单”文件夹下的所有excel文件，根据零件编号，EPR编号，写入交期及收货日期，并存为新的文件\n",
    "root = \"data/订单\"\n",
    "for dirpath, dirnames, filenames in walk(root):\n",
    "    for filename in filenames: #依次读取文件夹每个Excel文件进行操作\n",
    "        order = path.join(dirpath, filename)\n",
    "        print(\"处理订单：{}\".format(filename))\n",
    "        wb = load_workbook(order)\n",
    "        ws = wb.active\n",
    "        for row_number in range(5,ws.max_row+1):\n",
    "            part_num = ws.cell(row = row_number, column = 1).value\n",
    "            epr_num = ws.cell(row = row_number,column = 5).value\n",
    "            if epr_num: #判断该行是否有EPR编号，若有，才进一步查询交期及收货日期\n",
    "                # 判断采购信息表里是否有零件号和EPR编号，若都有，则写入其交期\n",
    "                if part_num in promise_date and epr_num in promise_date[part_num]:\n",
    "                    ws.cell(row=row_number,column = 6).value = promise_date[part_num][epr_num]\n",
    "                else:\n",
    "                    #零件号及EPR编号只要有一个未找到，就写入“未采购”\n",
    "                    ws.cell(row=row_number,column = 6).value = \"未采购\"\n",
    "                #判断收货信息表里是否有零件号和EPR编号，若都有则写入收货日期\n",
    "                if part_num in receive_date and epr_num in receive_date[part_num]:\n",
    "                        ws.cell(row=row_number, column=7).value = receive_date[part_num][epr_num]\n",
    "                else:\n",
    "                    #零件号及EPR编号只要有一个未找到，就写入“未收到货”\n",
    "                    ws.cell(row=row_number, column=7).value = \"未收到货\"\n",
    "        #搞定一行之后，在检查下一行，直到这个工作表全部检查完，然后保存，处理下一个Excel表\n",
    "        wb.save('data/订单更新/%s' % filepath)#在新的文件夹中使用原来文件名保存文件\n",
    "print(\"完成!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "采购信息和收货信息获取完成之后，就可以进行批量查询及在订单Excel表中写入数据了。用于获取文件路径的`os`模块，我们在实例2中有用到，可移步此链接查看：https://zhuanlan.zhihu.com/p/60423212 。我们利用它获取所有需要进行处理的Excel文件。此处，我们是将待处理文件存在`data`文件夹下面的`订单`文件夹内的。我们使用`for`循环，依次打开每个文件，然后进行数据的查询和写入。订单Excel文件的内容如下：\n",
    "![](images\\order.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们基于订单表中的零件编号和EPR编号，到采购信息和收货信息里面做匹配，只有当零件编号和EPR编号都匹配的时候，才写入数据。因为订单表从第5行开始才是我们需要匹配的数据，所以从第5行开始逐行查询并写入。我们先看EPR编号是否有内容，如果为空直接跳过，有内容，才进行下一步处理，`if epr_num:`即用于完成该判断。\n",
    "\n",
    "EPR编号有内容，则`epr_num`为True，程序继续执行。我们先看零件号和EPR编号是否都在采购信息里，若是，就在承诺日期列对应的行写入承诺日期，`if part_num in promise_date and epr_num in promise_date[part_num]`。只要“零件号”和“EPR编号”有一个没在采购信息里，就写入“未采购”，即表示采购尚未下单。收货日期也是采用类似的逻辑。直到查询并写入订单表的最后一行，然后保存，并打开下一个文件进行同样的处理。为便于区分及不影响原始数据，我们将更新后的文件存到了“订单更新”文件夹。\n",
    "\n",
    "需要注意的是，在运行程序之前，需要将所有Excel文件关闭，不然可能因为文件处于打开状态，无法存储而报错。最后写好的文件如下所示：\n",
    "![](images\\result.png)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
